Clustering Data with Categorical Relationships
Abstract
Data clustering has traditionally dealt with numeric data, categorical data, or a mixture of both. For such data, the focus was on finding a relationship between the data points to be clustered. These relationships were limited to being either binary or fuzzy. Both relied on a numeric value, a distance or other similarity measure between two data points, and grouped the points together if they were found to be similar. Over time, a new kind of relationship, called a categorical relationship, was observed between data points, quite different from the traditional ones. This paper focuses on handling data points that have categorical relationships and on the techniques that have emerged to date in this direction.
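To make the traditional setting concrete, the following is a minimal sketch of how a single numeric distance can yield either a binary or a fuzzy relationship between two points. The function names, the `threshold` and `scale` parameters, and the exponential decay are illustrative assumptions and are not taken from the paper. Categorical relationships are not modelled here, because the abstract does not define them.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length numeric points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def binary_relation(p, q, threshold=1.0):
    """Crisp relationship: 1 if the points lie within `threshold`, else 0.

    `threshold` is a hypothetical cut-off chosen for illustration.
    """
    return 1 if euclidean(p, q) <= threshold else 0

def fuzzy_relation(p, q, scale=1.0):
    """Graded relationship in (0, 1]: 1 for identical points, decaying with distance.

    The exponential form and `scale` are assumptions; any monotone
    decreasing map from distance to [0, 1] would serve the same purpose.
    """
    return math.exp(-euclidean(p, q) / scale)

a, b, c = (0.0, 0.0), (0.3, 0.4), (3.0, 4.0)

near_binary = binary_relation(a, b)  # a and b are close
far_binary = binary_relation(a, c)   # a and c are far apart
near_fuzzy = fuzzy_relation(a, b)
far_fuzzy = fuzzy_relation(a, c)
```

In both cases the decision to group two points rests on one number, which is the limitation the abstract contrasts with categorical relationships.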
Similar resources
A Clustering Algorithm for Categorical Data Based on Combining Measures
Clustering is one of the main techniques in data mining. It is a process that partitions a data set into groups such that the data within a cluster are as close to each other as possible and the data in different clusters differ as much as possible. Clustering algorithms fall into two categories according to the type of data: clustering algorithms for numerical data and clustering algor...
Clustering Numerical and Categorical Data
Clustering is an important data mining technique that allows us to discover unknown relationships in our data sets. Clustering algorithms that use metrics based on the natural ordering of numbers cannot be applied to categorical (non-numerical) data. In this tutorial we will review the main methods for numerical data clustering (K-Means, Hierarchical Clustering and Fuzzy C-Means) and then s...
Automatic Clustering of Mixed Data Using a Genetic Algorithm
In real-world clustering problems, cluster analysis must often be performed on data sets with mixed numeric and categorical values. However, most existing clustering algorithms are efficient only for numeric data rather than for mixed data sets. In addition, traditional methods such as the K-means algorithm usually require the user to provide the number of clusters. In this...
Data Mining with Semantic Features Represented as Vectors of Semantic Clusters
Data mining with taxonomies merged with categorical data has been studied in the past but has often been limited to small taxonomies. Taxonomies are used to aggregate categorical data so that patterns induced from the data can be expressed at higher levels of conceptual generality. Semantic similarity and relatedness measures can be used to aggregate categorical values for cluster-based data mining al...
A Simple Yet Fast Clustering Approach for Categorical Data
Categorical data has always posed a challenge for clustering-based data analysis. With growing awareness of big data analysis, the need has arisen for better clustering methods for categorical and mixed data. The prevailing clustering algorithms are not suitable for clustering categorical data, mainly because the distance functions used for continuous data are not applicable for...
Publication date: 2016